- 
                Notifications
    You must be signed in to change notification settings 
- Fork 390
[AutoTuner] Restore resume check #3070
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
99ea816    to
    8e18ff7      
    Compare
  
            
          
                tools/AutoTuner/test/resume_check.py
              
                Outdated
          
        
      | folders = glob.glob(os.path.join(experiment_dir, f"variant-*-or-{iteration}")) | ||
| return max((os.path.getmtime(folder) for folder in folders), default=9e99) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a little obfuscated:
- The function's name does not match the expected return value type/format. A check should return True/False.
- Returning 9e99 is not clear on what is going on.
- Creating a folder does not confirm the run status. For this test to be true to its purpose, we need to guarantee that it has not finished, not just started.
We should consider using a get_experiment_status function to check if all iterations have finished running; this function would return at least "RUNNING" and "FINISHED", other states might be helpful but are not required now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The function's name does not match the expected return value type/format. A check should return True/False.
Can possibly change it to
get_trial_times
- Returning 9e99 is not clear on what is going on.
It is just a dummy value, to compare for latest modified time in while loop line 121-128
            # Check if first config is complete
            while True:
                cur_modified_time = self.check_trial_times()
                print(f"Current modified time: {cur_modified_time}")
                print(f"Latest modified time: {latest_modified_time}")
                if abs(cur_modified_time - latest_modified_time) < 1e-3:
                    break
                latest_modified_time = cur_modified_time
                time.sleep(10)- Creating a folder does not confirm the run status. For this test to be true to its purpose, we need to guarantee that it has not finished, not just started.
This function returns the latest modified time of a given iteration (folder names are matched using
iterationglob) - so if a run is completed the folder should no longer be modified.
4, We should consider using a get_experiment_status function to check if all iterations have finished running; this function would return at least "RUNNING" and "FINISHED", other states might be helpful but are not required now.
get_experiment_statusis helpful in general, but might not be too useful in resume_check because we need to check iteration completion, as opposed to experiment completion (or all_iterations completion)
Signed-off-by: Jack Luar <jluar@precisioninno.com>
…ror handling Signed-off-by: Jack Luar <jluar@precisioninno.com>
Signed-off-by: Jack Luar <jluar@precisioninno.com>
Signed-off-by: Jack Luar <jluar@precisioninno.com>
8e18ff7    to
    cf5ed3d      
    Compare
  
    Signed-off-by: Jack Luar <jluar@precisioninno.com>
Signed-off-by: Jack Luar <jluar@precisioninno.com>
This pull request includes several changes to the
AutoTunertests to re-enable previously disabled tests, improve the test setup, and add new utility methods.Fixes #3005
Re-enabling tests and improving setup:
flow/test/test_autotuner.sh: Re-enabled thetest_tune_resumetest in theAutoTunerscript by uncommenting the test execution line.Codebase simplification and improvements:
tools/AutoTuner/test/resume_check.py: Removed unnecessary directory changes and imported theglobmodule to facilitate file pattern matching. [1] [2]tools/AutoTuner/test/resume_check.py: Introduced thecheck_trial_timesmethod to check the modification times of trial iterations, improving the robustness of thetest_tune_resumetest.